Exploration server on OCR

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library

Internal identifier: 000373 (Main/Exploration); previous: 000372; next: 000374

Authors: Roderic DM Page [United Kingdom]

RBID : PMC:3129327

Abstract

Background

The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive.

Description

A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site http://biostor.org/openurl/. This resolver can be used on the web, or called by bibliographic tools that support OpenURL.
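
The description above names three techniques: approximate string matching, regular expressions, and string alignment. As a rough illustration only, the sketch below shows how the first two might be applied to a citation and a scanned page. It is not BioStor's implementation: the function names, the normalisation rule, and the page-number pattern are all assumptions made for this example, and the standard-library `difflib` matcher stands in for whatever matching and alignment method the service actually uses.

```python
import re
from difflib import SequenceMatcher


def normalise(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def title_similarity(a: str, b: str) -> float:
    """Approximate string match: similarity ratio in [0, 1] of two normalised titles."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def printed_page_number(ocr_text: str):
    """Guess the printed page number from the first or last non-empty OCR line.

    Hypothetical heuristic: the line must consist of 1 to 4 digits, with at
    most two non-digit characters on either side (for example "- 7 -").
    """
    lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
    for line in lines[:1] + lines[-1:]:
        m = re.fullmatch(r"\D{0,2}(\d{1,4})\D{0,2}", line)
        if m:
            return int(m.group(1))
    return None
```

A caller could, for instance, accept a candidate journal title when `title_similarity` exceeds some threshold and then use `printed_page_number` to check that a scanned page carries the cited start page. Any such threshold would need to be tuned against real data.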

Conclusions

BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from http://biostor.org/.


DOI: 10.1186/1471-2105-12-187
PubMed: 21605356
PubMed Central: 3129327


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library</title>
<author>
<name sortKey="Page, Roderic Dm" sort="Page, Roderic Dm" uniqKey="Page R" first="Roderic Dm" last="Page">Roderic Dm Page</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, Graham Kerr Building, University of Glasgow, Glasgow G12 8QQ, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, Graham Kerr Building, University of Glasgow, Glasgow G12 8QQ</wicri:regionArea>
<orgName type="university">Université de Glasgow</orgName>
<placeName>
<settlement type="city">Glasgow</settlement>
<region type="country">Écosse</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21605356</idno>
<idno type="pmc">3129327</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3129327</idno>
<idno type="RBID">PMC:3129327</idno>
<idno type="doi">10.1186/1471-2105-12-187</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">000149</idno>
<idno type="wicri:Area/Pmc/Curation">000149</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000130</idno>
<idno type="wicri:Area/Ncbi/Merge">000102</idno>
<idno type="wicri:Area/Ncbi/Curation">000102</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000102</idno>
<idno type="wicri:Area/Main/Merge">000378</idno>
<idno type="wicri:Area/Main/Curation">000373</idno>
<idno type="wicri:Area/Main/Exploration">000373</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library</title>
<author>
<name sortKey="Page, Roderic Dm" sort="Page, Roderic Dm" uniqKey="Page R" first="Roderic Dm" last="Page">Roderic Dm Page</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, Graham Kerr Building, University of Glasgow, Glasgow G12 8QQ, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, Graham Kerr Building, University of Glasgow, Glasgow G12 8QQ</wicri:regionArea>
<orgName type="university">Université de Glasgow</orgName>
<placeName>
<settlement type="city">Glasgow</settlement>
<region type="country">Écosse</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive.</p>
</sec>
<sec>
<title>Description</title>
<p>A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site <ext-link ext-link-type="uri" xlink:href="http://biostor.org/openurl/">http://biostor.org/openurl/</ext-link>. This resolver can be used on the web, or called by bibliographic tools that support OpenURL.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from <ext-link ext-link-type="uri" xlink:href="http://biostor.org/">http://biostor.org/</ext-link>.</p>
</sec>
</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Royaume-Uni</li>
</country>
<region>
<li>Écosse</li>
</region>
<settlement>
<li>Glasgow</li>
</settlement>
<orgName>
<li>Université de Glasgow</li>
</orgName>
</list>
<tree>
<country name="Royaume-Uni">
<region name="Écosse">
<name sortKey="Page, Roderic Dm" sort="Page, Roderic Dm" uniqKey="Page R" first="Roderic Dm" last="Page">Roderic Dm Page</name>
</region>
</country>
</tree>
</affiliations>
</record>
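
The identifiers in the record above can be read programmatically. The sketch below is a minimal illustration, not part of Dilib: it parses a shortened copy of the `<publicationStmt>` element with Python's standard `xml.etree.ElementTree` and collects the `<idno>` values by their `type` attribute. The helper name `identifiers` is hypothetical. The fragment is used rather than the whole record because the record as printed uses the `wicri:`, `nlm:`, and `xlink:` prefixes without visible namespace declarations, which a namespace-aware parser would be expected to reject.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <publicationStmt> element from the record above.
FRAGMENT = """
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21605356</idno>
<idno type="pmc">3129327</idno>
<idno type="doi">10.1186/1471-2105-12-187</idno>
<date when="2011">2011</date>
</publicationStmt>
"""


def identifiers(xml_text: str) -> dict:
    """Map each <idno> element's type attribute to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {el.get("type"): el.text for el in root.iter("idno")}


ids = identifiers(FRAGMENT)
print(ids["doi"], ids["pmid"])
```

Here `"wicri:source"` is only an attribute value, so it does not require a namespace declaration and parses without error.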

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000373 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000373 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:3129327
   |texte=   Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:21605356" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024